Support strip, lstrip, and rstrip in strings_udf #12091
Some minor suggestions for improvement. Feel free to merge once you've addressed them; I don't need to look again unless something new comes up (in which case feel free to re-request a review).
@@ -229,6 +229,7 @@ def resolve_count(self, mod):
         "isnumeric",
         "istitle",
     ]
+    string_binary_funcs = ["strip", "lstrip", "rstrip"]
This name is somewhat misleading. There are other string binary functions that we already have implemented, including operators and things like find and contains. Is this list meant to contain binary string functions that also return a string, or is it even more specific than that?
Renaming this to string_return_attrs
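The distinction raised in this thread can be checked in plain Python, whose built-in str methods share the names discussed here. The sketch below groups a few one-argument string methods by the type they return. The method list and sample values are illustrative assumptions, not the contents of the strings_udf typing code.

```python
# Group some one-argument str methods by the type of value they return.
# The method list below is an illustrative assumption, not cuDF's actual list.
sample = "  abc  "
arg = " "
binary_methods = ["strip", "lstrip", "rstrip", "find", "count", "startswith", "endswith"]

by_return_type = {}
for name in binary_methods:
    # Look up the method on the sample string and call it with one argument.
    result = getattr(sample, name)(arg)
    by_return_type.setdefault(type(result).__name__, []).append(name)

for type_name, names in by_return_type.items():
    print(type_name, names)
```

Only the three strip variants land in the str group, which is why a name tied to the return type describes the list more precisely than one tied to the number of arguments.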
Are tests failing because there's somewhere extra that needs to have the new method attributes added?
I think this is just because those tests need to be marked
Co-authored-by: Vyas Ramasubramani <[email protected]>
@davidwendt I'm getting one failing test locally for the following string using:
>>> import cudf
>>> sr = cudf.Series(['.123a'])
>>> def f(st):
...     return st.lstrip('a')
...
>>> sr.apply(f)
0    .123
dtype: object
It looks like this
>>> import cudf
>>> sr = cudf.Series(['.123a'])
>>> def f(st):
...     return st.rstrip('.')
...
>>> sr.apply(f)
0    123a
dtype: object
>>>
Here it looks like this was
I'm not seeing this error in my unit tests
Could the problem be that all of these are calling just
cudf/python/strings_udf/strings_udf/lowering.py, lines 59 to 67 in 302fe60
@davidwendt you're completely right; with 02167d3 the tests pass.
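For reference, the two outputs reported above are what plain Python produces when a two-sided strip is substituted for the one-sided variants. The sketch below uses only built-in str methods on the same input string. The `buggy_*` helpers are hypothetical stand-ins for a lowering that dispatches every variant to the same routine; they are not the code in lowering.py.

```python
s = ".123a"

# Hypothetical stand-ins: every variant forwards to the two-sided strip.
def buggy_lstrip(st, chars):
    return st.strip(chars)

def buggy_rstrip(st, chars):
    return st.strip(chars)

# Expected results from the correct one-sided methods.
# Neither call changes the string, because 'a' is not at the left end
# and '.' is not at the right end.
expected_lstrip = s.lstrip("a")
expected_rstrip = s.rstrip(".")

# Results when both sides are stripped regardless of the method name.
observed_lstrip = buggy_lstrip(s, "a")
observed_rstrip = buggy_rstrip(s, ".")

print(expected_lstrip, observed_lstrip)
print(expected_rstrip, observed_rstrip)
```

The stand-ins return `.123` and `123a`, matching the session output above, while the correct one-sided methods leave `.123a` unchanged in both cases.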
Codecov Report
Base: 88.05% // Head: 88.08% // Increases project coverage by
Additional details and impacted files
@@ Coverage Diff @@
## branch-22.12 #12091 +/- ##
================================================
+ Coverage 88.05% 88.08% +0.02%
================================================
Files 135 135
Lines 22057 22092 +35
================================================
+ Hits 19423 19459 +36
+ Misses 2634 2633 -1
@gpucibot merge
This PR adds support for the following three functions in strings_udf:
str.strip(other)
str.lstrip(other)
str.rstrip(other)
Part of #9639
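As a reminder of the semantics these three methods have in standard Python, the argument is treated as a set of characters to remove from the ends, not as a literal prefix or suffix. The sketch below runs on ordinary Python strings, on the assumption that the UDF versions are meant to mirror this behavior. The function `f` and the sample values are illustrative only.

```python
def f(st):
    # Remove any leading or trailing 'x' or 'y' characters, in any order.
    return st.strip("xy")

samples = ["xyabcyx", "abc", "yyy"]

stripped = [f(st) for st in samples]
left_only = [st.lstrip("xy") for st in samples]
right_only = [st.rstrip("xy") for st in samples]

print(stripped)
print(left_only)
print(right_only)

# With cuDF installed, the same function could be applied per row in the
# style used earlier in this thread: cudf.Series(samples).apply(f)
```

A string made up entirely of characters from the argument, such as `yyy`, is reduced to the empty string by all three methods.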